Learning in Games via Reinforcement and Regularization

نویسندگان

  • Panayotis Mertikopoulos
  • William H. Sandholm
چکیده

We investigate a class of reinforcement learning dynamics in which players adjust their strategies based on their actions’ cumulative payoffs over time – specifically, by playing mixed strategies that maximize their expected cumulative payoff minus a strongly convex, regularizing penalty term. In contrast to the class of penalty functions used to define smooth best responses in models of stochastic fictitious play, the regularizers used in this paper need not be infinitely steep at the boundary of the simplex; in fact, dropping this requirement gives rise to an important dichotomy between steep and nonsteep cases. In this general setting, our main results extend several properties of the replicator dynamics such as the elimination of dominated strategies, the asymptotic stability of strict Nash equilibria and the convergence of timeaveraged trajectories to interior Nash equilibria in zero-sum games.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cycles in Adversarial Regularized Learning

Regularized learning is a fundamental technique in online optimization, machine learning and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when faced against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system’s behav...

متن کامل

Joint Learning in Stochastic Games: Playing Coordination Games Within Coalitions

Despite the progress in multiagent reinforcement learning via formalisms based on stochastic games, these have difficulties coping with a high number of agents due to the combinatorial explosion in the number of joint actions. One possible way to reduce the complexity of the problem is to let agents form groups of limited size so that the number of the joint actions is reduced. This paper inves...

متن کامل

L1 Regularized Linear Temporal Difference Learning

Several recent efforts in the field of reinforcement learning have focused attention on the importance of regularization, but the techniques for incorporating regularization into reinforcement learning algorithms, and the effects of these changes upon the convergence of these algorithms, are ongoing areas of research. In particular, little has been written about the use of regularization in onl...

متن کامل

Learning through reinforcement for N-person repeated constrained games

The design and analysis of an adaptive strategy for N-person averaged constrained stochastic repeated game are addressed. Each player is modeled by a stochastic variable-structure learning automaton. Some constraints are imposed on some functions of the probabilities governing the selection of the player's actions. After each stage, the payoff to each player as well as the constraints are rando...

متن کامل

MazeBase: A Sandbox for Learning from Games

This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning. Within it, we create 10 simple games embodying a range of algorithmic tasks (e.g. if-then statements or set negation). A variety of neural models (fully connected, convolutional network, memory network) are deployed via reinforcement learning on th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Math. Oper. Res.

دوره 41  شماره 

صفحات  -

تاریخ انتشار 2016